





DOCUMENT RESUME



ED 039 092 


PE 002 437 


AUTHOR        Johnson, Dale D.
TITLE         Vowel Cluster-Phoneme Correspondences in 20,000
              English Words.
INSTITUTION   Wisconsin Univ., Madison.
PUB DATE      69
NOTE          5p.

EDRS PRICE    EDRS Price MF-$0.25 HC-$0.35
DESCRIPTORS   *Phonemes, *Pronunciation, *Spelling, *Vowels, Word
              Frequency, *Word Lists


ABSTRACT

The symbol-sound correspondence status of vowel-cluster (two or more
adjacent vowel letters) spelling in American English was investigated.
The source of the study was Venezky's 1963 revision of the Thorndike
Frequency Count. A computer print-out of the 20,000 word corpus was
analyzed to determine the letter-sound correspondence of vowel cluster
spelling. Totals and pronunciations for each correspondence, as well as
representative word lists, were compiled. The analysis revealed: (1) There
are 61 vowel clusters representing 92 different single vowel phonemes and
phoneme strings, producing more than 300 symbol-sound correspondences.
(2) There was great variance in the frequency of the 61 vowel clusters.
(3) Vowel clusters vary greatly in the number of individual phonemes or
phoneme strings they represent. (4) Most vowel cluster pronunciations are
unpredictable from their spelling. (WB)














ABSTRACT 

VOWEL CLUSTER - PHONEME CORRESPONDENCES

By Dale D. Johnson
Assistant Professor
Elementary Education
Ball State University
Muncie, Indiana

The research which is reported below was conducted at the University
of Wisconsin in 1969 with the guidance of Thomas Barrett and Richard
Venezky.

This study was designed to answer the question: "What is the symbol-
sound correspondence status of vowel cluster spelling in American English?"
The source of the study was a corpus of 20,000 common English words, a
1963 Venezky revision of the Thorndike Frequency Count (1941). A
computer program was developed by Venezky which derived and tabulated
all letter-sound correspondences within the corpus.

Venezky's unpublished computer print-out was analyzed by the investigator
to determine the letter-sound correspondence of vowel cluster spellings
(2 or more adjacent vowel letters). Totals and pronunciations for each
correspondence, as well as representative word lists, were compiled. This
analysis disclosed, among other things, the following:

1. There are 61 vowel clusters (including those containing the
   semi-vowels w and y) in the corpus.

2. These 61 vowel clusters represent 92 different vowel phonemes and
   phoneme strings, producing more than 300 symbol-sound
   correspondences.

3. About one-third of the words in the corpus contain vowel clusters.

4. There is great variance in the frequency of the 61 vowel clusters.
   One occurs in more than 1,000 words, while 26, such as eue,
   occur in only one word.

5. Vowel clusters vary greatly in the number of individual phonemes
   or phoneme strings they represent: some represent only one sound
   while one, ea, represents 17 sounds.

6. Vowel clusters represent one phoneme about 80% of the time.

7. Of the 26 most common vowel clusters, those occurring in 50 or
   more words, only 4 follow the "first vowel long, second vowel
   silent" generalization in 75% or more of their occurrences:
   ai, ay, ee, and oa. Two more, ea and ow, adhere to this slightly
   more than 50% of the time. With the remaining 20, the
   generalization is seldom or never true.

8. Most vowel cluster pronunciations are unpredictable from their
   spellings.
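
The adherence figures for the "first vowel long, second vowel silent"
generalization can be illustrated with a minimal sketch. It is not
Venezky's program: the function name, the record layout, the phoneme
notation, and the eight sample words are all assumptions made for
illustration, and the sample is far too small to reproduce the
percentages reported above.

```python
from collections import defaultdict

# Assumed mapping from a cluster's first letter to its "long" sound.
# The notation is illustrative; the study's own symbols may differ.
LONG_SOUND = {"a": "e", "e": "i", "i": "aɪ", "o": "o", "u": "ju"}


def generalization_rate(records):
    """Return {cluster: share of words pronounced with the long sound
    of the cluster's first letter}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for cluster, _word, phoneme in records:
        totals[cluster] += 1
        if phoneme == LONG_SOUND.get(cluster[0]):
            hits[cluster] += 1
    return {c: hits[c] / totals[c] for c in totals}


# Tiny hand-made sample (cluster, word, pronunciation). NOT the study's data.
SAMPLE = [
    ("ai", "bait", "e"),
    ("ai", "rain", "e"),
    ("ai", "plaid", "æ"),
    ("ai", "aisle", "aɪ"),
    ("oa", "boat", "o"),
    ("oa", "broad", "ɔ"),
    ("ou", "out", "aʊ"),
    ("ou", "soup", "u"),
]

rates = generalization_rate(SAMPLE)
for cluster in sorted(rates):
    print(f"{cluster}: {rates[cluster]:.0%}")
```

Run over a full corpus of (cluster, word, pronunciation) records, the
same tally would yield the kind of per-cluster percentage that the 75%
and 50% cut-offs above refer to.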
The research which is reported below was conducted at the University
of Wisconsin in 1969 with the guidance of Thomas Barrett and Richard
Venezky.

Reading includes the translation from spelling to sound, and the vowel
clusters (2 or more adjacent vowel letters) are perhaps the most complex
and unpredictable components of the letter-sound correspondence code.
Vowel cluster spellings differ from single vowel spellings in several ways.
They rarely appear before geminate consonant clusters; some, such as ai
and au, occur infrequently in word final position, while others, such as
oa and ie, rarely begin a word in English.
Some vowel clusters have a major phonemic correspondent, and several
minor correspondents. For example, the major correspondent of ai is /e/
as in bait, and ai represents this sound 85% of the time that it occurs. It
represents other sounds, as in villain, aisle, again, plaid, and others,
much less frequently. Other vowel clusters have two or more major
correspondents, as well as minor correspondents. The vowel cluster ow is
/o/ as in own 51% of the time and /aʊ/ as in owl 48%. The only minor
correspondent is the sound in knowledge.
By contrast, all single vowel spellings have two major correspondences
(e.g., a is /e/ or /æ/ as in rate and rat) plus several minor
correspondences.
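
The distinction between major and minor correspondents can be sketched
as a simple share calculation. The text does not state a numerical
cut-off for "major", so the 25% threshold below is an assumption, as are
the function name and the counts, which are chosen only to mirror the
51% and 48% figures quoted for ow and are not the study's word counts.

```python
def classify_correspondents(counts, major_threshold=0.25):
    """Split a cluster's pronunciations into major and minor
    correspondents by their share of all occurrences.

    major_threshold is an assumed cut-off, not one given in the study.
    """
    total = sum(counts.values())
    major, minor = {}, {}
    for pronunciation, n in counts.items():
        share = n / total
        if share >= major_threshold:
            major[pronunciation] = share
        else:
            minor[pronunciation] = share
    return major, minor


# Illustrative counts per 100 occurrences of ow (hypothetical numbers).
OW_COUNTS = {"o (own)": 51, "aʊ (owl)": 48, "as in knowledge": 1}

major, minor = classify_correspondents(OW_COUNTS)
print("major:", major)
print("minor:", minor)
```

Under these assumed counts ow has two major correspondents and one
minor one, which is the pattern the paragraph above describes.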

As part of an interdisciplinary study of the reading process begun
at Cornell University in 1961, Venezky developed a computer program to
derive and tabulate letter-sound correspondences in a corpus of
20,000 common English words (1963). The 20,000 word corpus was a
modification of the most common 20,000 words according to the Thorndike
Frequency Count (1941). Venezky omitted many archaic and low-frequency
words, particularly proper nouns, and added a number of words in their
place. Along with other information, the computer analysis provided an
inclusive tabulation of letter-sound correspondences found in the corpus
as well as totals and percentages for each pronunciation in each word
position, and a complete word list for each correspondence. A Pronouncing
Dictionary of American English (Kenyon and Knott, 1953) was used to
determine the pronunciation of most words in the corpus.
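
A tabulation of vowel clusters by word position, of the general kind
described above, might be sketched as follows. This is not Venezky's
program. The regular expression is a simplifying assumption: it treats
two or more adjacent vowel letters as a cluster, and counts w or y as
part of a cluster only when it follows a vowel and is not itself
followed by one. The function names and the word list are hypothetical.

```python
import re
from collections import Counter

# Assumed cluster pattern: vowels plus a trailing semi-vowel (as in "ow",
# "ay"), or a run of two or more vowel letters (as in "ai", "oa").
CLUSTER = re.compile(r"[aeiou]+[wy](?![aeiou])|[aeiou]{2,}")


def position(match, word):
    """Classify where a cluster sits in its word."""
    if match.start() == 0:
        return "initial"
    if match.end() == len(word):
        return "final"
    return "medial"


def tabulate(words):
    """Count (cluster, position) pairs over a list of words."""
    table = Counter()
    for word in words:
        word = word.lower()
        for match in CLUSTER.finditer(word):
            table[(match.group(), position(match, word))] += 1
    return table


# Small illustrative word list, not drawn from the 20,000 word corpus.
WORDS = ["bait", "aisle", "own", "owl", "knowledge",
         "boat", "play", "tree", "water"]

table = tabulate(WORDS)
for (cluster, where), count in sorted(table.items()):
    print(f"{cluster:4} {where:8} {count}")
```

A spelling-only pass like this finds the clusters and their positions.
Attaching pronunciations, as the original analysis did, would also
require a pronouncing dictionary keyed by word.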

Venezky's unpublished computer print-out of spelling-to-sound
correspondences in 20,000 English words was analyzed by this writer to
determine letter-sound correspondences for contiguous vowels. This analysis
disclosed the following:

1. There are 61 vowel clusters (including those containing the
   semi-vowels w and y) in the corpus.

2. These 61 vowel clusters represent 92 different single vowel
   phonemes and phoneme strings, producing more than 300 symbol-
   sound correspondences.

   For example, oa represents /o/ and other phonemes and phoneme
   strings. Yet each of these, and others, are represented by a
   variety of spellings. Consequently, there are over 300
   symbol-sound correspondences.

3. These 61 vowel clusters appear more than 6,000 times in the
   20,000 word corpus.

4. There is great variance in the frequency of the 61 vowel clusters.
   One occurs in more than 1,000 words while 26 occur in three
   words or less.

5. Vowel clusters vary greatly in the number of individual phonemes
   or phoneme strings they represent; some represent only one
   sound while one represents 17 sounds.

6. Most vowel cluster pronunciations are unpredictable from their
   spellings.

7. Of the 61 vowel clusters, 30 occur in 10 or more words in the
   corpus. Of these 30, 23 occur in words in which the vowel
   cluster is sometimes disyllabic. Only six of these vowel
   clusters are disyllabic more often than monosyllabic. Thus,
   these 30 vowel clusters, occurring in more than 6,000 words,
   represent single vowel phonemes about 80% of the time and two
   or more phonemes about 20%.

If accepted, the presentation at IRA would include the four most
common pronunciations of each of the 26 vowel clusters which occur in
50 or more words, together with the percentages and number of words for
each vowel cluster-sound correspondence.













